[GCS] Move node resource info from client table to resource table #5050
raulchen merged 19 commits into ray-project:master
Conversation
Test PASSed.
@raulchen @romilbhardwaj Any comments on this one? Thanks.
Oops, I forgot to review this one. Will take a look soon.
}
NodeInfo nodeInfo = new NodeInfo(
-    clientId, data.getNodeManagerAddress(), true, resources);
+    clientId, data.getNodeManagerAddress(), true, new HashMap<>());
Nit: prefer using ImmutableMap.of().
Preconditions.checkState(data.getEntryType() == EntryType.DELETION);
NodeInfo nodeInfo = new NodeInfo(clientId, clients.get(clientId).nodeAddress,
-    false, clients.get(clientId).resources);
+    false, new HashMap<>());
try (Jedis jedis = jedisPool.getResource()) {
    return jedis.hgetAll(key);
}
All the other methods in this class have a blank line at the end. Should I remove the blank lines from all methods?
Better to also remove them, thanks.
python/ray/gcs_utils.py
Outdated
"TablePubsub",
"Task",
"TaskTableData",
"RayResource",
Maybe rename RayResource to ResourceTableData?
src/ray/raylet/node_manager.cc
Outdated
// 1) Client cache is up to date.
// 2) Lookup resource table after subscribe.
io_service_.post([this]() {
  // Fetch resource info for all clients and update cluster resource map.
Maybe I'm misunderstanding something. Why do we need to fetch resource info for all clients here? This is the callback for one node being added; why not just fetch that node?
It seems like this lambda is called only when the callback is for the local (self) raylet. This is the case when the raylet has been newly instantiated and has no resource state of other raylets. This lambda populates the resources in the raylet by querying the resource table and calling ResourceCreateUpdated on the resources of all nodes. Is that correct @kfstorm?
@romilbhardwaj You are right. Previously, the resource information was stored in the client table, so a client table notification carried resource update information. Now that the two kinds of data are separated, we need some extra logic to make sure local resource information is up to date.
I updated the code to look up resources for only one client at the end of ClientAdded.
src/ray/raylet/node_manager.cc
Outdated
worker = worker_pool_.GetRegisteredDriver(client);
std::unordered_map<std::string, std::shared_ptr<gcs::RayResource>> data_map;
auto ray_resource = std::make_shared<gcs::RayResource>();
ray_resource->set_resource_name(resource_name);
I think the resource_name field is redundant, because the resource table is a Hash and the key is already the resource name.
You mean we should remove the resource_name field from the RayResource structure? BTW, we can't replace the RayResource structure with a plain double, because the Hash template class requires a protobuf structure as the value type in order to serialize it.
Yes, we can have a ResourceTableData that only has a double field.
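For illustration, the suggested value type might look like the following protobuf definition (a sketch, not the exact definition in Ray's source; the field name is an assumption):

```proto
// Sketch: the hash key in the resource table is already the resource
// name, so the value only needs to carry the capacity.
message ResourceTableData {
  // Total capacity of this resource on the node (e.g. number of CPUs).
  double resource_capacity = 1;
}
```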
src/ray/raylet/node_manager.cc
Outdated
// Delay execution to make sure,
// 1) Client cache is up to date.
// 2) Lookup resource table after subscribe.
io_service_.post([this]() {
How do we guarantee that this lambda will run after the client cache is up to date?
Calling RequestNotifications on the client table pushes all existing entries to the raylet. The raylet handles a single notification containing multiple client entries and adds both alive and dead clients to the client cache. So here we look up resources asynchronously, to ensure all client entries are added to the client cache before the resource lookup runs.
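The ordering argument can be modeled with a toy single-threaded event loop (a Python sketch, not Ray code; it only demonstrates the FIFO-handler property of boost::asio's io_service that the explanation above relies on):

```python
from collections import deque

# Toy FIFO event loop standing in for boost::asio's io_service.
tasks = deque()

def post(fn):
    tasks.append(fn)

client_cache = {}
log = []

def handle_client_notification(client_id):
    # Adds one client entry to the cache (alive or dead).
    client_cache[client_id] = {"alive": True}
    log.append(("client_added", client_id))

def lookup_resources():
    # Posted after the notification handlers, so by the time it runs
    # every client entry is already in the cache.
    log.append(("lookup", sorted(client_cache)))

# The initial notification pushes one handler per known client...
for cid in ["client_a", "client_b"]:
    post(lambda cid=cid: handle_client_notification(cid))
# ...and the resource lookup is posted afterwards.
post(lookup_resources)

while tasks:
    tasks.popleft()()
```

Because the loop runs handlers in FIFO order, the lookup always observes the fully populated cache, which is the guarantee the posted lambda depends on.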
This change is confusing, so I updated the code to look up resources for only one client at the end of ClientAdded.
romilbhardwaj
left a comment
Thanks for submitting this PR! I've added some comments.
return [node_info[client_id] for client_id in ordered_client_ids]


def _parse_resource_table(redis_client, client_id):
Nit: considering client_id is stored/used as hex in most places in state.py and elsewhere, would it be more consistent to have the client_id argument be hex here and convert it to binary inside this method, rather than invoking it as _parse_resource_table(redis_client, ray.utils.hex_to_binary(client_id))?
Sure. Sorry, I'm not familiar with state.py.
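For reference, the hex/binary conversion being discussed is a thin wrapper around binascii; a self-contained sketch (mirroring what ray.utils.hex_to_binary does, not the actual Ray implementation):

```python
import binascii

def hex_to_binary(hex_id: str) -> bytes:
    # Convert a hex-encoded client ID (the form used in state.py and the
    # dashboard) to the raw bytes used as Redis keys.
    return binascii.unhexlify(hex_id)

def binary_to_hex(binary_id: bytes) -> str:
    # The inverse conversion, for displaying a binary ID.
    return binascii.hexlify(binary_id).decode()

# Example: a hypothetical 20-byte client ID, hex-encoded.
client_id_hex = "0f" * 20
client_id_bin = hex_to_binary(client_id_hex)
```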
python/ray/state.py
Outdated
Args:
    redis_client: A client to the primary Redis shard.
    client_id: The client ID of the node.
It would be good to clarify the encoding (hex/binary) of the client_id here.
src/ray/raylet/node_manager.cc
Outdated
// Delay execution to make sure,
// 1) Client cache is up to date.
// 2) Lookup resource table after subscribe.
io_service_.post([this]() {
Perhaps I'm missing something, but instead of using this lambda, why can't we update the resources of the remote clients at line 412, when they're added to the local raylet? Even if the client cache is not up to date, subsequent notifications should handle updates/deletes in the local raylet.
I just want to make sure the remote client is already registered in the local raylet when a resource update notification is received. This prevents potential bugs in the resource update notification handling code path.
After discussing with @raulchen, we think it's OK to put the lookup at the end of ClientAdded and only look up resources for one client. So I updated the code as you suggested, to avoid the lambda.
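The agreed-upon shape, sketched in Python (hypothetical names; the real code is C++ in node_manager.cc): ClientAdded registers the client first, then issues an async lookup for just that client's resources.

```python
class NodeManagerSketch:
    """Toy model of the ClientAdded flow discussed above (not Ray's API)."""

    def __init__(self, resource_table):
        self.cluster_resource_map = {}
        self.resource_table = resource_table

    def client_added(self, client_id):
        # Register the client first, so a later resource notification
        # always finds it in the cache...
        self.cluster_resource_map[client_id] = {}
        # ...then look up resources for only this one client.
        self.resource_table.async_lookup(client_id, self._on_resources)

    def _on_resources(self, client_id, resources):
        self.cluster_resource_map[client_id] = resources


class FakeResourceTable:
    """Stand-in for the GCS resource table; synchronous for the sketch."""

    def __init__(self, data):
        self.data = data

    def async_lookup(self, client_id, callback):
        callback(client_id, self.data.get(client_id, {}))


table = FakeResourceTable({"node_1": {"CPU": 4.0}})
nm = NodeManagerSketch(table)
nm.client_added("node_1")
```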
src/ray/raylet/monitor.cc
Outdated
}
if (!marked) {
  RAY_CHECK_OK(gcs_client_.client_table().MarkDisconnected(client_id));
  // Remove all resources of this client from GCS
Maybe sometimes we'll want to query the resources of a dead node. I think we can keep them; it won't waste too much memory.
What do these changes do?
This is the continuation of #4911. Previously, node resource information was stored in the client table; this PR moves it to the resource table. This change makes the client table more lightweight and simplifies the client table's merge logic, because RES_CREATEUPDATE and RES_DELETE are removed from the code. Resource changes for a living node no longer need to update the client table.

Related issue number
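The data-model change can be illustrated with plain dictionaries (a sketch; the keys and field names are illustrative, not the actual GCS schema):

```python
# Before: resources embedded in each client-table entry, so any resource
# change rewrote the client entry (via RES_CREATEUPDATE / RES_DELETE).
client_table_before = {
    "node_1": {"node_address": "10.0.0.1", "is_alive": True,
               "resources": {"CPU": 8.0, "GPU": 1.0}},
}

# After: the client table is lightweight; resources live in a separate
# per-node hash keyed by resource name, with a numeric capacity as value.
client_table_after = {
    "node_1": {"node_address": "10.0.0.1", "is_alive": True},
}
resource_table = {
    "node_1": {"CPU": 8.0, "GPU": 1.0},
}

# A resource change for a living node now touches only the resource table:
resource_table["node_1"]["GPU"] = 2.0
```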
Linter
I've run scripts/format.sh to lint the changes in this PR.